RENEWALS FOR EXPONENTIALLY INCREASING LIFETIMES,
WITH AN APPLICATION TO DIGITAL SEARCH TREES 

By Florian Dennert and Rudolf Grübel

Universität Hannover

We show that the number of renewals up to time $t$ exhibits distributional
fluctuations as $t \to \infty$ if the underlying lifetimes increase at an
exponential rate in a distributional sense. This provides a probabilistic
explanation for the asymptotics of insertion depth in random trees generated
by a bit-comparison strategy from uniform input; we also obtain a
representation for the resulting family of limit laws along subsequences.
Our approach can also be used to obtain rates of convergence.

1. Introduction. Let $(Y_k)_{k\in\mathbb{N}}$ be a sequence of independent,
nonnegative random variables and let $(S_n)_{n\in\mathbb{N}_0}$,

$$ S_0 := 0, \qquad S_n := \sum_{k=1}^{n} Y_k \quad \text{for all } n \in \mathbb{N}, $$

be the associated sequence of partial sums. Regarding the $Y_k$'s as
successive lifetimes and $S_n$ as the time of the $n$th renewal, we interpret

$$ N_t := \sup\{n \in \mathbb{N}_0 : S_n \le t\} $$

as the number of renewals up to and including time $t$; $(N_t)_{t\ge 0}$ is
the renewal process. Standard renewal theory assumes that the $Y_k$'s all
have the same distribution, in which case $N_t$, appropriately rescaled, is
asymptotically normal as $t \to \infty$. For this result, and for renewal
theory in general, we refer the reader to Section XI in [3].
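
The definition of $N_t$ can be made concrete with a few lines of Python. This is only a minimal sketch for a finite list of nonnegative lifetimes; the function name `renewals_up_to` is hypothetical and not part of the paper.

```python
from itertools import accumulate


def renewals_up_to(lifetimes, t):
    """Return N_t = sup{n >= 0 : S_n <= t} for a finite list of lifetimes.

    Assumes nonnegative lifetimes, so the partial sums S_n are nondecreasing
    and we may stop at the first partial sum that exceeds t.
    """
    count = 0
    for partial_sum in accumulate(lifetimes):
        if partial_sum > t:
            break
        count += 1
    return count


# Lifetimes 1, 2, 4, 8 have partial sums 1, 3, 7, 15.
print(renewals_up_to([1, 2, 4, 8], 3))
```

Because the list is finite, the returned value is capped at its length; for an infinite sequence one would generate lifetimes lazily until the partial sum exceeds $t$.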

In this note we consider exponentially increasing lifetimes. We show that
in such a case the distribution of $N_t$ does not converge and that
asymptotic distributional fluctuations appear (Section 2). Such fluctuations
occur frequently in the analysis of algorithms. The renewal theoretic
framework provides a probabilistic view of this phenomenon in connection with
digital search trees (Section 3). We also indicate how our approach can be
used to obtain rates of convergence (Section 4).

2. Renewals for increasing lifetimes. We assume that the lifetimes increase
exponentially with rate $\alpha$, where $\alpha > 1$ is fixed throughout the
sequel, in the sense that

$$ \alpha^{-k} Y_k \to_{\mathrm{distr}} Y_\infty \quad\text{and}\quad \alpha^{-k} E Y_k \to E Y_\infty \tag{1} $$

for some random variable $Y_\infty$ and as $k \to \infty$. Here
"$\to_{\mathrm{distr}}$" denotes convergence in distribution, so that the
first part of (1) means that

$$ \lim_{k\to\infty} E f(\alpha^{-k} Y_k) = E f(Y_\infty) $$

for all bounded continuous functions $f : \mathbb{R} \to \mathbb{R}$. Below
we will use the fact that in order to prove $X_n \to_{\mathrm{distr}} X$ it
is sufficient to show that $E f(X_n) \to E f(X)$ holds for all bounded and
uniformly continuous functions. For details and a general treatment of
convergence in distribution we refer the reader to [1]. Of course, only the
distribution $\mu = \mathcal{L}(Y_\infty)$ of $Y_\infty$ matters, so we will
occasionally write $\alpha^{-k} Y_k \to_{\mathrm{distr}} \mu$ instead.
Finally, throughout this note a condition involving moments is meant to
imply that these moments are finite.

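An important role will be played by

$$ S_\infty := \sum_{k=0}^{\infty} \alpha^{-k} Y_{\infty,k}, $$

where $(Y_{\infty,k})_{k\in\mathbb{N}_0}$ is a sequence of independent and
identically distributed random variables with
$\mathcal{L}(Y_{\infty,0}) = \mathcal{L}(Y_\infty)$, $Y_\infty$ as in (1).
From $E Y_\infty < \infty$ we obtain
$E S_\infty = \alpha(\alpha-1)^{-1} E Y_\infty < \infty$ and therefore
$P(S_\infty < \infty) = 1$; moreover, we then also have that
$\sum_{k=0}^{n} \alpha^{-k} Y_{\infty,k}$ converges almost surely and hence
in distribution to $S_\infty$ as $n \to \infty$. We will also assume that
$\mathcal{L}(Y_\infty)$ has no atoms, that is,

$$ P(Y_\infty = y) = 0 \quad \text{for all } y \in \mathbb{R}_+. \tag{2} $$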

Finally, it is an elementary analytic fact that, for a sequence
$(x_n)_{n\in\mathbb{N}}$ of real numbers with limit $x$,

$$ \lim_{n\to\infty} \sum_{k=0}^{n-1} \alpha^{-k} x_{n-k} = \frac{\alpha x}{\alpha - 1}. \tag{3} $$
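
A quick numerical check of (3) may help fix ideas. The sketch below uses an arbitrarily chosen sequence $x_m = 3 + 1/m$ (limit $x = 3$) and $\alpha = 2$; both choices and the function names are illustrative assumptions, not taken from the paper.

```python
def weighted_tail_sum(x, alpha, n):
    """Compute sum_{k=0}^{n-1} alpha^{-k} * x(n - k), the left side of (3)."""
    return sum(alpha ** (-k) * x(n - k) for k in range(n))


def example_sequence(m):
    """A hypothetical convergent sequence with limit 3."""
    return 3.0 + 1.0 / m


ALPHA = 2.0
LIMIT = ALPHA * 3.0 / (ALPHA - 1.0)  # right side of (3): alpha * x / (alpha - 1)

approx_200 = weighted_tail_sum(example_sequence, ALPHA, 200)
approx_2000 = weighted_tail_sum(example_sequence, ALPHA, 2000)
print(LIMIT, approx_200, approx_2000)
```

The weights $\alpha^{-k}$ concentrate on the most recent terms $x_n, x_{n-1}, \ldots$, which is why only the limit of the sequence enters the right-hand side.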



The following lemma can be regarded as a random version of (3). 

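Lemma 1. If (1) and (2) are satisfied, then
$\alpha^{-n} S_n \to_{\mathrm{distr}} S_\infty$ as $n \to \infty$, and
$P(S_\infty = y) = 0$ for all $y \in \mathbb{R}$.

Proof. Suppose that $(U_k)_{k\in\mathbb{N}}$ is a sequence of independent
random variables on some probability space $(\Omega, \mathcal{A}, P)$, all
uniformly distributed on the unit interval. Let $F_k$ be the distribution
function of $Y_k$, $F$ the distribution function of $Y_\infty$. We use a
variant of the quantile construction:

$$ \tilde Y_k := F_k^{-1}(U_k), \qquad \tilde Y_{\infty,k} := F^{-1}(U_k) \quad \text{for all } k \in \mathbb{N}. $$

We then have
$\mathcal{L}(\tilde Y_1, \ldots, \tilde Y_n) = \mathcal{L}(Y_1, \ldots, Y_n)$
for all $n \in \mathbb{N}$, which implies

$$ \mathcal{L}(\alpha^{-n} \tilde S_n) = \mathcal{L}(\alpha^{-n} S_n) \quad \text{with } \tilde S_n := \sum_{k=1}^{n} \tilde Y_k. $$

Using $\alpha^{-n} \tilde S_n = \sum_{k=0}^{n-1} \alpha^{-k} \bigl(\alpha^{-(n-k)} \tilde Y_{n-k}\bigr)$
we obtain

$$ E\Bigl| \alpha^{-n} \tilde S_n - \sum_{k=0}^{n-1} \alpha^{-k} \tilde Y_{\infty,n-k} \Bigr| \le \sum_{k=0}^{n-1} \alpha^{-k} E\bigl| \alpha^{-(n-k)} \tilde Y_{n-k} - \tilde Y_{\infty,n-k} \bigr|. \tag{4} $$

With $Y_k' := F_k^{-1}(U_1)$ and $Y_\infty' := F^{-1}(U_1)$ we have

$$ E\bigl| \alpha^{-k} \tilde Y_k - \tilde Y_{\infty,k} \bigr| = E\bigl| \alpha^{-k} Y_k' - Y_\infty' \bigr|. \tag{5} $$

From (1) it follows that $\alpha^{-k} Y_k' \to_{\mathrm{distr}} Y_\infty'$ and
$E \alpha^{-k} Y_k' \to E Y_\infty'$. Because of $Y_k' \ge 0$, Theorem 5.4 in
[1] applies and gives the $L^1$-convergence of $\alpha^{-k} Y_k'$ to
$Y_\infty'$, that is, $E|\alpha^{-k} Y_k' - Y_\infty'| \to 0$ as
$k \to \infty$. Using this together with (3), (4) and (5) we obtain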



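$$ \lim_{n\to\infty} E\Bigl| \alpha^{-n} \tilde S_n - \sum_{k=0}^{n-1} \alpha^{-k} \tilde Y_{\infty,n-k} \Bigr| = 0. \tag{6} $$

Now let $f : \mathbb{R} \to \mathbb{R}$ be bounded and uniformly continuous.
We have

$$
\begin{aligned}
\bigl| E f(\alpha^{-n} S_n) - E f(S_\infty) \bigr|
&\le \Bigl| E f(\alpha^{-n} \tilde S_n) - E f\Bigl( \sum_{k=0}^{n-1} \alpha^{-k} \tilde Y_{\infty,n-k} \Bigr) \Bigr|
 + \Bigl| E f\Bigl( \sum_{k=0}^{n-1} \alpha^{-k} Y_{\infty,k} \Bigr) - E f\Bigl( \sum_{k=0}^{\infty} \alpha^{-k} Y_{\infty,k} \Bigr) \Bigr| \\
&\le E \Bigl| f(\alpha^{-n} \tilde S_n) - f\Bigl( \sum_{k=0}^{n-1} \alpha^{-k} \tilde Y_{\infty,n-k} \Bigr) \Bigr|
 + E \Bigl| f\Bigl( \sum_{k=0}^{n-1} \alpha^{-k} Y_{\infty,k} \Bigr) - f\Bigl( \sum_{k=0}^{\infty} \alpha^{-k} Y_{\infty,k} \Bigr) \Bigr|.
\end{aligned}
$$

For the first integral on the right-hand side we use (6), for the second an
elementary estimate shows that the difference between the arguments of $f$
converges to 0 in probability. In both cases we now use uniform continuity
when the arguments of $f$ are close to each other and boundedness otherwise.
This leads to

$$ \lim_{n\to\infty} E f(\alpha^{-n} S_n) = E f(S_\infty), $$

which gives the convergence in distribution. The statement about the atoms
of $S_\infty$ follows from (2) and the fact that $S_\infty$ is equal in
distribution to $Y_\infty + \alpha^{-1} S_\infty$ with $Y_\infty$ and
$S_\infty$ independent. □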

The above proof is based on classical weak convergence arguments. An
alternative proof can be obtained via the Wasserstein distance

$$ d_W(\mu, \nu) = \inf\{ E|X - Y| : \mathcal{L}(X) = \mu,\ \mathcal{L}(Y) = \nu \}, $$

its known relation to weak convergence and convergence of the first moments,
and the same variant of the quantile construction, which in this context is
known as the comonotone coupling.
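
The comonotone coupling can be illustrated numerically. The sketch below borrows the geometric lifetimes of Section 3 (success probability $2^{-k+1}$, $\alpha = 2$, limit law exponential with rate 2) as a test case and approximates $E|\alpha^{-k} F_k^{-1}(U) - F^{-1}(U)|$ by a midpoint rule over $U$. The function names and the grid size are assumptions made for this illustration.

```python
import math


def geometric_quantile(u, p):
    """Smallest j in {1, 2, ...} with 1 - (1 - p)^j >= u, for 0 < u < 1."""
    if p >= 1.0:
        return 1
    return max(1, math.ceil(math.log1p(-u) / math.log1p(-p)))


def exponential_quantile(u, rate):
    """Quantile function of the exponential distribution with the given rate."""
    return -math.log1p(-u) / rate


def coupling_l1_gap(k, grid=20_000):
    """Midpoint-rule approximation of E|2^{-k} F_k^{-1}(U) - F^{-1}(U)|.

    Both quantile functions are evaluated at the same uniform variable U,
    which is exactly the comonotone coupling.
    """
    p = 2.0 ** (-k + 1)
    total = 0.0
    for i in range(grid):
        u = (i + 0.5) / grid
        scaled_lifetime = 2.0 ** (-k) * geometric_quantile(u, p)
        total += abs(scaled_lifetime - exponential_quantile(u, 2.0))
    return total / grid


print(coupling_l1_gap(3), coupling_l1_gap(8))
```

The gap shrinks as $k$ grows, in line with the $L^1$-convergence used in the proof of Lemma 1.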

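We write $\lfloor x \rfloor$ for the greatest integer less than or equal to
$x$ and $\{x\}$ for the fractional part of $x \in \mathbb{R}$.

Theorem 2. Suppose that (1) and (2) are satisfied and let

$$ Q_\eta := \mathcal{L}\bigl( \lfloor -\log_\alpha S_\infty + \eta \rfloor \bigr), \qquad 0 \le \eta \le 1. \tag{7} $$

If $(t_n)_{n\in\mathbb{N}}$ is a sequence of real numbers with
$t_n \to \infty$ and such that $\{\log_\alpha t_n\} \to \eta$ for some
$\eta \in [0, 1]$, then

$$ N_{t_n} - \lfloor \log_\alpha t_n \rfloor \to_{\mathrm{distr}} Q_\eta \quad \text{as } n \to \infty. $$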

Proof. We use the abbreviations $k_n := \lfloor \log_\alpha t_n \rfloor$ and
$\eta_n := \{\log_\alpha t_n\}$. In particular,
$\log_\alpha t_n = k_n + \eta_n$. Further, let
$Z_\infty := -\log_\alpha S_\infty$. By a standard renewal theoretic argument,

$$ P(N_t = j) = P(S_j \le t) - P(S_{j+1} \le t) \quad \text{for all } t \ge 0,\ j \in \mathbb{N}_0, $$

hence

$$
\begin{aligned}
P(N_{t_n} - k_n = j) &= P(S_{k_n+j} \le t_n) - P(S_{k_n+j+1} \le t_n) \\
&= P\bigl( -\log_\alpha(\alpha^{-k_n-j} S_{k_n+j}) + \eta_n \ge j \bigr) \\
&\quad - P\bigl( -\log_\alpha(\alpha^{-k_n-j-1} S_{k_n+j+1}) + \eta_n \ge j+1 \bigr) \\
&\to P\bigl( \lfloor Z_\infty + \eta \rfloor = j \bigr) \quad \text{as } n \to \infty,
\end{aligned}
$$

where in the last step Lemma 1 and three general facts about convergence
in distribution were used: First, the continuous mapping theorem, which
implies that
$-\log_\alpha(\alpha^{-m} S_m) \to_{\mathrm{distr}} -\log_\alpha S_\infty$ as
$m \to \infty$; secondly, the interplay with convergence in probability, see
Theorem 4.1 in [1], which yields
$-\log_\alpha(\alpha^{-n} S_n) + \eta_n \to_{\mathrm{distr}} -\log_\alpha S_\infty + \eta$
as $n \to \infty$; finally, that $\mathcal{L}(S_\infty)$ and therefore also
$\mathcal{L}(-\log_\alpha S_\infty + \eta)$ assign probability 0 to single
points and that this implies

$$ \lim_{n\to\infty} P\bigl( -\log_\alpha(\alpha^{-n} S_n) + \eta_n \ge z \bigr) = P\bigl( -\log_\alpha S_\infty + \eta \ge z \bigr) \quad \text{for all } z \in \mathbb{R}. $$

□

A structural consequence of the representation (7) is the
$\to_{\mathrm{distr}}$-continuity of $\eta \mapsto Q_\eta$ on the open unit
interval; at $\eta = 0$ this function is right continuous, at $\eta = 1$ it
is left continuous. The extreme members are translates of each other in the
sense that $Q_0(\{j\}) = Q_1(\{j+1\})$ for all $j \in \mathbb{Z}$.
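
The translation property of the extreme members can be read off directly from (7), since adding an integer inside the floor function shifts the result by that integer. Spelled out as a short derivation:

```latex
\begin{align*}
Q_1(\{j+1\})
  &= P\bigl(\lfloor -\log_\alpha S_\infty + 1 \rfloor = j+1\bigr) \\
  &= P\bigl(\lfloor -\log_\alpha S_\infty \rfloor + 1 = j+1\bigr) \\
  &= P\bigl(\lfloor -\log_\alpha S_\infty \rfloor = j\bigr)
   = Q_0(\{j\}) \qquad \text{for all } j \in \mathbb{Z}.
\end{align*}
```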

The total variation distance $d_{TV}$ of probability measures is defined by

$$ d_{TV}(\mu, \nu) := \sup_{B} |\mu(B) - \nu(B)|, $$

for $\mu$, $\nu$ concentrated on $\mathbb{Z}$ this can be written as

$$ d_{TV}(\mu, \nu) = \frac{1}{2} \sum_{j\in\mathbb{Z}} \bigl| \mu(\{j\}) - \nu(\{j\}) \bigr|. \tag{8} $$

For a sequence of probability measures that are concentrated on a fixed
countable set Scheffé's lemma implies that weak convergence is equivalent
to convergence in total variation distance, hence (7) can be rewritten as

$$ \lim_{n\to\infty} d_{TV}\bigl( \mathcal{L}(N_{t_n} - \lfloor \log_\alpha t_n \rfloor),\ Q_{\{\log_\alpha t_n\}} \bigr) = 0. $$

Because of the continuity of $[0, 1] \ni \eta \mapsto Q_\eta$ this in turn
leads to a statement that avoids the use of subsequences,

$$ \lim_{t\to\infty} d_{TV}\bigl( \mathcal{L}(N_t - \lfloor \log_\alpha t \rfloor),\ Q_{\{\log_\alpha t\}} \bigr) = 0. \tag{9} $$

In Section 4 we will investigate the rate of convergence in (9) in a
particular case.

3. An application to digital search trees. The nodes of a (rooted,
directed) binary tree can be represented by finite strings of 0's and 1's if
we interpret 0 as a move to the left and 1 as a move to the right. The length
of the string is the depth (or level) of the node it represents, the root
node corresponds to the empty string and has level 0. The sequence
$(T_n)_{n\in\mathbb{N}}$ associated with a sequence $(x_n)_{n\in\mathbb{N}}$
of numbers from the unit interval by the DST (digital search tree) algorithm
is obtained as follows: For $T_1$, we put $x_1$ into the root node. If
$x_1, \ldots, x_n$ have been stored into $T_n$ then the position of $x_{n+1}$
is determined by traveling through the tree with the direction given by the
binary expansion of $x_{n+1}$ until an empty node has been found. This
algorithm and its properties are discussed in the standard texts of the area,
for example, [8, 10, 11]. As an example we consider the first ten numbers
given in [8], Appendix A,
$(\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{10}, \sqrt[3]{2}, \sqrt[3]{3}, \sqrt[4]{2}, \log 2, \log 3, \log 10)$.
Let $x_i$ be the fractional part of the $i$th entry, $1 \le i \le 10$; the
relevant first four bits of the respective binary expansions are given by
(0110, 1011, 0011, 0010, 0100, 0111, 0011, 1011, 0001, 0100). This leads to
the binary tree given in Figure 1.

Fig. 1. Binary tree.
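
The insertion rule can be sketched in a few lines of Python, using the bit strings of the ten example numbers. Nodes are addressed by their 0/1 strings as described above; the function name `dst_insert` and the set-based representation of the tree are choices made for this illustration only.

```python
def dst_insert(occupied, bits):
    """Insert a key with the given bit string into a digital search tree.

    `occupied` is the set of node addresses (0/1 strings, "" for the root)
    that already hold a key. We walk from the root along `bits` until the
    first free node, store the key there and return that node's address.
    """
    node = ""
    for bit in bits:
        if node not in occupied:
            break
        node += bit
    if node in occupied:
        raise ValueError("bit string too short to reach a free node")
    occupied.add(node)
    return node


# First four bits of the fractional parts of the ten example numbers.
example_bits = ["0110", "1011", "0011", "0010", "0100",
                "0111", "0011", "1011", "0001", "0100"]

tree = set()
positions = [dst_insert(tree, bits) for bits in example_bits]
print(positions)
```

The length of each returned address is the insertion depth of the corresponding key; the first key lands in the root (empty address).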

Consider now the sequence $(T_n)_{n\in\mathbb{N}_0}$ of random trees that
the DST algorithm associates with a sequence $(U_n)_{n\in\mathbb{N}}$ of
independent random variables, where we assume that the $U_n$'s are uniformly
distributed on the unit interval and that $T_0$ is the empty tree. Let
$X_n(\theta)$ be the depth of the first free node of $T_n$ along the path
determined by a sequence $\theta \in \{0, 1\}^{\mathbb{N}}$. Such a $\theta$
defines a family of nested intervals of length $2^{-k}$, $k = 1, 2, 3, \ldots$,
and it is easy to see that $(X_n(\theta))_{n\in\mathbb{N}_0}$ is a Markov
chain with $X_0(\theta) = 0$ and transition probabilities
$p_{k,k+1} = 1 - p_{k,k} = 2^{-k}$ for all $k \in \mathbb{N}_0$. Conditioning
on the value of $U_{n+1}$ we see that the distribution of $X_n(\theta)$ is the
same as the distribution of $Z_{n+1}$, the insertion depth of $U_{n+1}$. This
quantity is known as "unsuccessful search" in the literature on the analysis
of algorithms. [Of course, this distributional equality does not hold for
the joint distributions: $n \mapsto X_n(\theta)$ is increasing,
$n \mapsto Z_{n+1}$ is not.] For example, the next number in Knuth's list is
$x_{11} = 1/\log 2$, the binary expansion of the fractional part $\{x_{11}\}$
begins with 011100 and hence $x_{11}$ would be inserted at level 4 as the
right child of $x_6$.
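
Since the chain moves from depth $k$ to depth $k+1$ with probability $2^{-k}$ and stays put otherwise, the exact law of $X_n(\theta)$ can be computed by a simple forward recursion. The following is a minimal sketch; the function name `depth_distribution` is hypothetical.

```python
def depth_distribution(n):
    """Return [P(X_n = 0), ..., P(X_n = n)] for the birth chain with
    X_0 = 0 and transition probabilities p_{k,k+1} = 1 - p_{k,k} = 2^{-k}."""
    pmf = [1.0]  # X_0 = 0 with probability one
    for _ in range(n):
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            up = 2.0 ** (-k)
            new[k + 1] += mass * up        # move one level down the path
            new[k] += mass * (1.0 - up)    # stay at the current depth
        pmf = new
    return pmf


print(depth_distribution(3))
```

For small $n$ the probabilities are dyadic rationals and can be checked by hand, for example $P(X_2 = 1) = P(X_2 = 2) = 1/2$.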

The Markov chain $(X_n(\theta))_{n\in\mathbb{N}_0}$ is of the simple birth
type and can therefore be described by its respective holding times
$Y_1, Y_2, Y_3, \ldots$ in the states $k = 0, 1, 2, \ldots$. These are
independent, and $Y_k$ has a geometric distribution with parameter
$p_{k-1,k}$, that is, for all $k \in \mathbb{N}$,

$$ P(Y_k = j) = (1 - 2^{-k+1})^{j-1}\, 2^{-k+1} \quad \text{for all } j \in \mathbb{N}. $$

Here we interpret the case $k = 1$ as $Y_1$, the holding time in 0, being
constant and equal to 1. As a result of its simple stochastic dynamics,
$(X_n(\theta))_{n\in\mathbb{N}}$ is equal to the renewal process $N$
associated with the sequence $(Y_k)_{k\in\mathbb{N}}$, observed at discrete
times, that is, $(X_n(\theta))_{n\in\mathbb{N}} = (N_n)_{n\in\mathbb{N}}$. It
is easy to see that for this sequence $(Y_k)_{k\in\mathbb{N}}$ of lifetimes
conditions (1) and (2) are satisfied and that
$\mathcal{L}(Y_\infty) = \mathrm{Exp}(2)$, with $\mathrm{Exp}(\lambda)$ the
exponential distribution with parameter $\lambda$ (and mean $1/\lambda$).
Hence Theorem 2 can be applied: If the sequence
$(n(m))_{m\in\mathbb{N}} \subset \mathbb{N}$ is such that $n(m) \to \infty$
and $\{\log_2 n(m)\} \to \eta$ as $m \to \infty$, then

$$ X_{n(m)}(\theta) - \lfloor \log_2 n(m) \rfloor \to_{\mathrm{distr}} Q_\eta. \tag{10} $$

Here $Q_\eta$, $0 \le \eta \le 1$, is the distribution of
$\lfloor -\log_2 S_\infty + \eta \rfloor$,
$S_\infty := \sum_{k=0}^{\infty} 2^{-k} Y_{\infty,k}$ and $Y_{\infty,k}$,
$k \in \mathbb{N}_0$, are independent and identically distributed with
$\mathcal{L}(Y_{\infty,1}) = \mathrm{Exp}(2)$. Alternatively, we can write
$S_\infty := \sum_{k=1}^{\infty} \hat Y_k$ with $\hat Y_k$,
$k \in \mathbb{N}$, again independent and
$\mathcal{L}(\hat Y_k) = \mathrm{Exp}(2^k)$ for all $k \in \mathbb{N}$.
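
For the reader's convenience, here is one way to verify conditions (1) and (2) for the geometric lifetimes with $\alpha = 2$; this is a routine calculation spelled out under the definitions above, not a quotation from the paper.

```latex
\begin{align*}
% First moments: a geometric variable with parameter p has mean 1/p.
2^{-k}\, E Y_k &= 2^{-k} \cdot 2^{k-1} = \tfrac12 = E Y_\infty
   \quad\text{for } Y_\infty \sim \mathrm{Exp}(2),\ k \in \mathbb{N}, \\
% Tails: Y_k > m has probability (1 - 2^{-k+1})^m for integers m >= 0.
P\bigl(2^{-k} Y_k > y\bigr)
   &= \bigl(1 - 2^{-k+1}\bigr)^{\lfloor 2^k y \rfloor}
   \;\longrightarrow\; e^{-2y} = P(Y_\infty > y)
   \quad\text{as } k \to \infty,\ y \ge 0.
\end{align*}
% The limit law Exp(2) has a density, hence no atoms, which is condition (2).
```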

The explicit representation of the family of limit distributions on the
basis of the convolution product of the distributions $\mathrm{Exp}(2^k)$,
$k \in \mathbb{N}$, can be used to obtain a series expansion for the
distribution functions associated with $Q_\eta$, $0 \le \eta \le 1$. For
this, we start with a partial fraction expansion: For all $n \in \mathbb{N}$
and all $z \in \mathbb{C}$ with $|\Re(z)| < 2$,

$$ \prod_{k=1}^{n} (1 - 2^{-k} z)^{-1} = \sum_{k=1}^{n} a_{n,k} (1 - 2^{-k} z)^{-1}, \tag{11} $$

where
$a_{n,k} := \prod_{j=1}^{k-1} (1 - 2^{j})^{-1} \prod_{j=1}^{n-k} (1 - 2^{-j})^{-1}$.
Reading (11) as an equality relating characteristic functions we obtain

$$ \mathrm{Exp}(2^1) * \mathrm{Exp}(2^2) * \cdots * \mathrm{Exp}(2^n) = \sum_{k=1}^{n} a_{n,k}\, \mathrm{Exp}(2^k). \tag{12} $$

Note, however, that the right-hand side in (12) is not the usual mixture of
probability distributions as the coefficients alternate in sign. With

$$ a_k := b \prod_{j=1}^{k-1} (1 - 2^{j})^{-1}, \qquad b := \prod_{j=1}^{\infty} (1 - 2^{-j})^{-1}, $$

letting $n \to \infty$ in (12) leads to
$\mathcal{L}(S_\infty) = \sum_{k=1}^{\infty} a_k\, \mathrm{Exp}(2^k)$, so that

$$
\begin{aligned}
Q_\eta((-\infty, x]) &= P\bigl( \lfloor -\log_2(S_\infty) + \eta \rfloor \le x \bigr) \\
&= P\bigl( S_\infty > 2^{\eta - 1 - x} \bigr) \\
&= \sum_{k=1}^{\infty} a_k \exp\bigl( -2^{k + \eta - 1 - x} \bigr) \quad \text{for all } x \in \mathbb{Z}.
\end{aligned}
\tag{13}
$$
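
The alternating series (13) converges very quickly because the coefficients $a_k$ decay super-exponentially, so it is easy to evaluate numerically. The sketch below truncates the infinite product for $b$ and the series itself; the truncation points (200 factors, 60 terms) and all function names are assumptions of this illustration.

```python
import math


def series_coefficients(terms=60):
    """Return [a_1, ..., a_terms] with a_k = b * prod_{j<k} (1 - 2^j)^{-1}."""
    b = 1.0
    for j in range(1, 200):            # truncated infinite product for b
        b /= 1.0 - 2.0 ** (-j)
    coeffs, a = [], b
    for k in range(1, terms + 1):
        coeffs.append(a)               # current value is a_k
        a /= 1.0 - 2.0 ** k            # a_{k+1} = a_k / (1 - 2^k)
    return coeffs


def limit_cdf(eta, x, coeffs=None):
    """Evaluate Q_eta((-inf, x]) via the (truncated) series (13)."""
    if coeffs is None:
        coeffs = series_coefficients()
    return sum(a * math.exp(-(2.0 ** (k + eta - 1 - x)))
               for k, a in enumerate(coeffs, start=1))


def limit_pmf(eta, j):
    """Evaluate Q_eta({j}) as a difference of distribution function values."""
    coeffs = series_coefficients()
    return limit_cdf(eta, j, coeffs) - limit_cdf(eta, j - 1, coeffs)


print([round(limit_pmf(0.0, j), 4) for j in range(-3, 4)])
```

The coefficients sum to one, which reflects $P(S_\infty > 0) = 1$, and the values for $\eta = 0$ and $\eta = 1$ are translates of each other, as noted after Theorem 2.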
fc=i 

This representation of the limiting distribution functions as an alternating 
series has already been obtained by Louchard [9] in the context of digital 
search trees and by Flajolet [4] in the context of approximate counting; see 
also Section 6.4 in [10] and Section 6.3 in [8] for related results. These authors 
use a completely different approach, more analytic in flavor and relying on 
combinatorial identities due to Euler.

Our main point here, however, is not a rederivation of (13) but the
representation of the family $\{Q_\eta : 0 \le \eta \le 1\}$ in terms of one
particular random variable, which is first shifted by $\eta$ and then
discretized. This representation can, for example, be used to obtain
information about the tail behavior of the limit distributions. Janson [7]
notes that (13) by itself would only give an exponential rate of decrease
for the tail probabilities; he then provides an analytic argument that
improves this to a superexponential rate by showing that the associated
Fourier transform is an entire function. Using the representation
$S_\infty = \sum_{k=1}^{\infty} 2^{-k} Z_k$ with $Z_k$ independent and
$\mathcal{L}(Z_k) = \mathrm{Exp}(1)$ together with the fact that
$\mathrm{Exp}(1)$ has a density bounded by 1, we get

$$
\begin{aligned}
P(S_\infty < 2^{-j}) &\le P(Z_1 < 2^{-j+1})\, P(Z_2 < 2^{-j+2}) \cdots P(Z_{j-1} < 2^{-1}) \\
&\le 2^{-j+1}\, 2^{-j+2} \cdots 2^{-1} \\
&= 2^{-j(j-1)/2}
\end{aligned}
$$

for all $j \in \mathbb{N}$. Because of
$Q_\eta([k, \infty)) \le P(S_\infty < 2^{-k+1})$ for all $k \in \mathbb{N}$,
$k \ge 2$, this leads to

$$ Q_\eta([x, \infty)) = o\bigl( \exp(-\rho x^2) \bigr) \quad \text{as } x \to \infty, \text{ for all } \rho < (\log 2)/2. $$
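
The step from the product bound to the Gaussian-type tail estimate can be spelled out as follows; this is a short worked calculation under the bounds just stated, with $j = k - 1$.

```latex
\begin{align*}
Q_\eta([k,\infty))
  &\le P\bigl(S_\infty < 2^{-(k-1)}\bigr)
   \le 2^{-(k-1)(k-2)/2}
   = \exp\Bigl(-\tfrac{\log 2}{2}\,(k^2 - 3k + 2)\Bigr), \\
\frac{Q_\eta([k,\infty))}{\exp(-\rho k^2)}
  &\le \exp\Bigl(-\bigl(\tfrac{\log 2}{2} - \rho\bigr) k^2
        + \tfrac{3\log 2}{2}\,k - \log 2\Bigr)
   \;\longrightarrow\; 0
   \quad\text{as } k \to \infty, \text{ if } \rho < \tfrac{\log 2}{2}.
\end{align*}
```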

The fact that a representation by discretization is possible in many situ- 
ations where fluctuations were first found by calculation seems to belong to 
the folklore of the subject, at least in simple instances such as the asymp- 
totic distributional behavior of the maximum of a sample from a geometric 
distribution. The geometric case together with some renewal theoretic tech- 
niques (for identically distributed lifetimes) was used in [5] to obtain results 
of the above type for von Neumann addition. In [2] a discretization represen- 
tation occurs on the level of stochastic processes, leading to a probabilistic 
approach to fluctuation phenomena in the context of multiplicities of the 
maximum in a random sample from a discrete distribution. In a recent pa- 
per, Janson [7] studies the effects of discretizing random variables and the 
resulting distributional fluctuations and gives a range of interesting exam- 
ples. Of course, the explanation for periodicities can be, and indeed often is, 
quite different and mechanisms other than discretization may be responsible; 
see, for example, [6] and the references given there. 

4. Rates of convergence. The renewal theoretic approach can also be 
used to obtain rates of convergence. We sketch one of the possibilities, for 
a particular choice of distances, and give details for the DST situation from 
Section 3. Let, for $t > 0$, $k(t) := \lfloor \log_\alpha t \rfloor$ and $\eta(t) := \{\log_\alpha t\}$.

The Kolmogorov-Smirnov distance of two probability measures $\mu$ and $\nu$
on the real line is defined by

$$ d_{KS}(\mu, \nu) := \sup_{x\in\mathbb{R}} \bigl| \mu((-\infty, x]) - \nu((-\infty, x]) \bigr|. $$

If $X$ and $Y$ are real random variables, then we abbreviate
$d_{KS}(\mathcal{L}(X), \mathcal{L}(Y))$ to $d_{KS}(X, Y)$; if $F$ and $G$
are the associated distribution functions, then
$d_{KS}(X, Y) = \|F - G\|_\infty$, where the supremum norm for general
bounded functions $f : \mathbb{R} \to \mathbb{R}$ is given by
$\|f\|_\infty := \sup_{x\in\mathbb{R}} |f(x)|$. The Kolmogorov-Smirnov
distance is obviously invariant under strictly monotone transformations. For
example,

$$ d_{KS}(\alpha X + \beta, \alpha Y + \beta) = d_{KS}(X, Y) \quad \text{for all } \alpha, \beta \in \mathbb{R},\ \alpha \ne 0, $$

and for $X, Y > 0$,

$$ d_{KS}(X, Y) = d_{KS}(\log X, \log Y). $$
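
The invariance is easy to observe on empirical distribution functions, where the Kolmogorov-Smirnov distance depends only on the relative order of the pooled sample points. The sketch below uses two arbitrary positive samples (uniform draws with a fixed seed); the sample choice and the function name `ks_distance` are assumptions for illustration.

```python
import math
import random


def ks_distance(xs, ys):
    """Sup-distance between the empirical distribution functions of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    dist = 0.0
    while i < len(xs) and j < len(ys):
        value = min(xs[i], ys[j])
        # Advance past all sample points equal to the current value.
        while i < len(xs) and xs[i] == value:
            i += 1
        while j < len(ys) and ys[j] == value:
            j += 1
        dist = max(dist, abs(i / len(xs) - j / len(ys)))
    return dist


rng = random.Random(0)
xs = [rng.uniform(0.5, 2.0) for _ in range(200)]
ys = [rng.uniform(1.0, 3.0) for _ in range(300)]

d_plain = ks_distance(xs, ys)
d_affine = ks_distance([2.0 * x + 3.0 for x in xs], [2.0 * y + 3.0 for y in ys])
d_log = ks_distance([math.log(x) for x in xs], [math.log(y) for y in ys])
print(d_plain, d_affine, d_log)
```

All three printed values agree, since the affine map and the logarithm preserve the ordering of the pooled sample.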
With the notation as in the proof of Theorem 2,

$$
\begin{aligned}
&\bigl| P(N_t - k(t) = j) - P\bigl( \lfloor -\log_\alpha(S_\infty) + \eta(t) \rfloor = j \bigr) \bigr| \\
&\quad \le \bigl| P\bigl( -\log_\alpha(\alpha^{-k(t)-j} S_{k(t)+j}) + \eta(t) \ge j \bigr) - P\bigl( -\log_\alpha(S_\infty) + \eta(t) \ge j \bigr) \bigr| \\
&\qquad + \bigl| P\bigl( -\log_\alpha(\alpha^{-k(t)-j-1} S_{k(t)+j+1}) + \eta(t) \ge j+1 \bigr) - P\bigl( -\log_\alpha(S_\infty) + \eta(t) \ge j+1 \bigr) \bigr|.
\end{aligned}
$$

With the auxiliary quantities

$$ Z_t := \lfloor -\log_\alpha(S_\infty) + \eta(t) \rfloor, \qquad \phi(m) := d_{KS}(\alpha^{-m} S_m, S_\infty) $$

and the above properties of the Kolmogorov-Smirnov distance this leads to

$$ \bigl| P(N_t - k(t) = j) - P(Z_t = j) \bigr| \le \phi(k(t) + j) + \phi(k(t) + j + 1). \tag{14} $$

It is often possible to obtain an upper bound for negative $j$, say
$j < -k(t)/2$, directly. In such cases the above elementary renewal theoretic
argument leads to a bound for the $\|\cdot\|_\infty$-distance between the
probability mass functions of $N_t - k(t)$ and $Z_t$, for example; note that
the latter variable has distribution $Q_{\eta(t)}$ where $Q_\eta$,
$0 \le \eta \le 1$, is the set of limit distributions along subsequences that
appears in Theorem 2.

The above argument covers the step from
$(\alpha^{-m} S_m)_{m\in\mathbb{N}}$ to $(N_t)_{t\ge 0}$. However, in an
application the starting point will usually be the convergence of the scaled
lifetimes in (1), which means that we also need an analogue for Lemma 1 that
gives rates of convergence.

We carry this out in the specific context of digital search trees. The
following general bounds will turn out to be useful: If $X$ has density
$f_X$ and if $P(|Y| \le c) = 1$, then

$$ d_{KS}(X, X + Y) \le c\, \|f_X\|_\infty. \tag{15} $$

Indeed: For all $z \in \mathbb{R}$,
$P(X \le z - c) \le P(X + Y \le z) \le P(X \le z + c)$, so that

$$ |P(X + Y \le z) - P(X \le z)| \le \max\{ P(X \le z + c) - P(X \le z),\ P(X \le z) - P(X \le z - c) \}, $$

and, of course, $P(X \in (a, b]) \le (b - a) \|f_X\|_\infty$. This bound can
easily be generalized to

$$ d_{KS}(X, X + Y) \le c\, \|f_X\|_\infty + P(|Y| > c) \quad \text{for all } c > 0, \tag{16} $$

where we still assume that $X$ has density $f_X$, but $Y$ may be arbitrary.
Note that $X$ and $Y$ need not be independent in (15) and (16). If they are
independent then it is easy to show that

$$ d_{KS}(X, X + Y) \le \|f_X\|_\infty\, E|Y|. \tag{17} $$

In (17) boundedness of $Y$ is not needed but the bound obviously makes sense
only if $Y$ has finite first moment. Finally, in connection with density
bounds the interplay with convolution is of interest: We have
$\|f * g\|_\infty \le \|f\|_\infty$ for all probability densities $f$, $g$.
For example, if a sum of independent random variables contains a summand
with distribution $\mathrm{Exp}(\lambda)$, then the density of the sum is
bounded by $\lambda$.

Lemma 3. With $(Y_k)_{k\in\mathbb{N}}$ and $S_\infty$ as in Section 3,

$$ d_{KS}(2^{-n} S_n, S_\infty) = O(n 2^{-n}). $$

Proof. Let $(Z_k)_{k\in\mathbb{N}}$ be a sequence of independent random
variables, all exponentially distributed with parameter 1. Then $S_\infty$
is equal in distribution to $\sum_{k=1}^{\infty} 2^{-k} Z_k$. We recall that
the $k$th lifetime $Y_k$ has a geometric distribution with parameter
$2^{-k+1}$. On the basis of $(Z_k)_{k\in\mathbb{N}}$ we define a sequence
$(\tilde Y_k)_{k\in\mathbb{N}}$ by
$\tilde Y_k := \lfloor a_k Z_k \rfloor + 1$ for all $k \in \mathbb{N}$, with

$$ a_1 := 0, \qquad a_k := \bigl( -\log(1 - 2^{-k+1}) \bigr)^{-1} \quad \text{for } k > 1. $$

It is easy to check that

$$ (Y_k)_{k\in\mathbb{N}} =_{\mathrm{distr}} (\tilde Y_k)_{k\in\mathbb{N}}, \qquad 2^{-n} \sum_{k=1}^{n} 2^{k-1} Z_k =_{\mathrm{distr}} \sum_{k=1}^{n} 2^{-k} Z_k. $$
Hence, with 4>{n) denoting the dKS-distance of 2~ n S n and S ( 
<j)(n) < (fri(ii) + 02 (n) + 03 (n) for all n £ N, 



OO 5 



with 01,02,03 defined by 
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For the random variables in (j>\ we have 



V n < 2~ n J2 Y k < V n + n2~ n with V n := 2~ n £ a k Z k . 

k=l k=l 

It is easy to show that the densities of V n , n G N, can be uniformly bounded 
for all n by some finite constant C\, hence (15) implies that 4>i(n) < C±n2~ n 
for all n G N. 

The elementary bounds

$$ \frac{1}{x} - 1 \le -\frac{1}{\log(1 - x)} \le \frac{1}{x} \quad \text{for } 0 < x \le \frac{1}{2} $$

together with $a_1 = 0$ imply
$\sup_{k\in\mathbb{N}} |a_k - 2^{k-1}| = 1$, hence we have

$$ \Bigl| 2^{-n} \sum_{k=1}^{n} a_k Z_k - 2^{-n} \sum_{k=1}^{n} 2^{k-1} Z_k \Bigr| \le 2^{-n} \sum_{k=1}^{n} Z_k. $$

The familiar combination of Markov's inequality and moment generating
functions gives

$$ P\Bigl( \sum_{k=1}^{n} Z_k \ge (1 + \kappa) n \Bigr) = O(2^{-n}) $$

if $\kappa$ is chosen large enough, so that we can use (16) with
$c = c(n) = (1 + \kappa) n 2^{-n}$ to obtain that
$\phi_2(n) \le C_2 n 2^{-n}$ for all $n \in \mathbb{N}$, for some finite
constant $C_2$.

For $\phi_3$ finally we use (17): For the densities of the finite sums we
again have a finite uniform bound for all $n$, and

$$ E \sum_{k=n+1}^{\infty} 2^{-k} Z_k = \sum_{k=n+1}^{\infty} 2^{-k} E Z_k = 2^{-n}, $$

so that $\phi_3(n) \le C_3 2^{-n}$ for all $n \in \mathbb{N}$ with some
$C_3 < \infty$. Putting these together we arrive at

$$ \phi(n) \le C n 2^{-n} \quad \text{for all } n \in \mathbb{N} $$

with some finite constant $C$. □



In the application under consideration we obtain a rate of convergence 
result with respect to the total variation distance, which is stronger than 
a result for the supremum norm distance of the corresponding probability 
mass functions that we mentioned in connection with (14). 

Theorem 4. With $(X_n(\theta))_{n\in\mathbb{N}}$ and $Q_\eta$ as in
Section 3,

$$ d_{TV}\bigl( \mathcal{L}(X_n(\theta) - \lfloor \log_2 n \rfloor),\ Q_{\{\log_2 n\}} \bigr) = o(n^{-\gamma}) \quad \text{for all } \gamma < 1. $$

Proof. We use the abbreviations $k(n) := \lfloor \log_2 n \rfloor$ and
$\eta(n) := \{\log_2 n\}$. Let $\gamma < 1$ be given and choose
$\epsilon > 0$ such that $\epsilon < 1 - \gamma$. Lemma 3 together with (14)
gives

$$ \sum_{j \ge -\epsilon k(n)} \bigl| P(N_n - k(n) = j) - Q_{\eta(n)}(\{j\}) \bigr| \le C \sum_{j \ge (1-\epsilon) k(n)} j\, 2^{-j} $$

for all $n \in \mathbb{N}$ with some finite constant $C$. Our choice of
$\epsilon$ implies that the upper bound has the desired rate
$o(n^{-\gamma})$.

For the remaining part of the infinite sum in (8) we replace the absolute
difference of the probabilities by their sum, which means that it is now
enough to show that

$$ P\bigl( N_n < (1 - \epsilon) k(n) \bigr) = o(n^{-\gamma}), \tag{18} $$

$$ P\bigl( -\log_2(S_\infty) < -\epsilon k(n) + 1 \bigr) = o(n^{-\gamma}). \tag{19} $$

Here we have used that $Q_\eta$ is the distribution of
$\lfloor -\log_2(S_\infty) + \eta \rfloor$. It is easy to show that the
moment generating function for $S_\infty$ exists in a neighborhood of 0,
hence

$$ P(S_\infty > x) = o(e^{-\kappa x}) \quad \text{for all } x > 0 \tag{20} $$

with some $\kappa > 0$. Straightforward manipulations show that (20) implies
(19); indeed, the probability converges faster to 0 than any negative power
of $n$. Using once again the relation between the number of renewals and the
partial sums of the lifetimes we further obtain, with
$m(n, \epsilon) := \lceil (1 - \epsilon) k(n) \rceil$,

$$
\begin{aligned}
P\bigl( N_n < (1 - \epsilon) k(n) \bigr) &\le P\bigl( S_{m(n,\epsilon)} > n \bigr) \\
&\le d_{KS}\bigl( 2^{-m(n,\epsilon)} S_{m(n,\epsilon)},\ S_\infty \bigr) + P\bigl( S_\infty > n 2^{-m(n,\epsilon)} \bigr).
\end{aligned}
$$

For the Kolmogorov-Smirnov distance we use Lemma 3, for the tail of
$S_\infty$ the desired rate follows with (20). This gives (18) and hence
completes the proof. □
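
As a numerical illustration of Theorem 4 (a sketch, not part of the proof), one can compare the exact law of $X_n(\theta) - \lfloor \log_2 n \rfloor$, obtained from the transition probabilities $p_{k,k+1} = 2^{-k}$, with $Q_{\{\log_2 n\}}$ evaluated through the series (13), using formula (8) for the total variation distance. The function names, the depth cap of 64, the 60 series terms and the summation range for $j$ are all assumptions of this sketch.

```python
import math


def depth_pmf(n, cap=64):
    """Law of X_n for the birth chain with p_{k,k+1} = 2^{-k}, X_0 = 0.

    Depths above `cap` are merged into `cap`; this truncation is an
    assumption of the sketch and only moves a negligible amount of mass.
    """
    pmf = [1.0] + [0.0] * cap
    for _ in range(n):
        new = [0.0] * (cap + 1)
        for k, mass in enumerate(pmf):
            if mass == 0.0:
                continue
            up = 2.0 ** (-k) if k < cap else 0.0
            new[k] += mass * (1.0 - up)
            if k < cap:
                new[k + 1] += mass * up
        pmf = new
    return pmf


def series_coefficients(terms=60):
    """Coefficients a_1, ..., a_terms of the alternating series (13)."""
    b = 1.0
    for j in range(1, 200):
        b /= 1.0 - 2.0 ** (-j)
    coeffs, a = [], b
    for k in range(1, terms + 1):
        coeffs.append(a)
        a /= 1.0 - 2.0 ** k
    return coeffs


def limit_cdf(eta, x, coeffs):
    """Q_eta((-inf, x]) via the truncated series (13)."""
    return sum(a * math.exp(-(2.0 ** (k + eta - 1 - x)))
               for k, a in enumerate(coeffs, start=1))


def tv_to_limit(n):
    """Total variation distance (8) between L(X_n - floor(log2 n)) and Q_eta."""
    log_n = math.log2(n)
    k_n = math.floor(log_n)
    eta = log_n - k_n
    pmf = depth_pmf(n)
    coeffs = series_coefficients()
    total = 0.0
    for j in range(-k_n - 12, 61):   # mass outside this range is negligible
        depth = k_n + j
        exact = pmf[depth] if 0 <= depth < len(pmf) else 0.0
        limit = limit_cdf(eta, j, coeffs) - limit_cdf(eta, j - 1, coeffs)
        total += abs(exact - limit)
    return 0.5 * total


for n in (1, 16, 256, 1024):
    print(n, tv_to_limit(n))
```

The printed distances decrease as $n$ grows, which is the behavior the theorem describes; the sketch makes no claim about the constants involved.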